Does Single Family Zoning correlate with greater segregation?

The dissimilarity index was calculated at the block group level. The Future Land Use (FLU) layer was used to represent zoning through its Max Dwelling Units per Acre field, which we use to define single-family zoning. The analysis found some correlation between single-family zoning and segregation. There is also a relationship between Max Dwelling Units (Max DU) and segregation over the entire range of Max DU: lower Max DU values are associated with more White people, and higher Max DU values are associated with more People of Color. The differing ranges of Max DU per acre across the region make the analysis difficult to perform. I couldn’t easily find a way to communicate these findings to a policy-board-level audience.

From a technical standpoint, we are looking into:

Does the Maximum Dwelling Units per Acre in a Block Group correlate with the Dissimilarity Index of the Block Group?

The segregation measure is a set of dissimilarity indices between racial groups. For each pair of groups, a dissimilarity measure is calculated for each block group. More documentation can be found here: https://www.censusscope.org/about_dissimilarity.html
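As a rough illustration of how a measure like this can be built, the sketch below takes a toy set of block groups and compares each block group's share of the regional White population with its share of the regional People of Color population. The counts, the `white` and `minority` column names, and the scaling by 100 are assumptions for illustration. The signed per-block-group form is inferred from the negative values reported later in this analysis and may not match the exact calculation used here.

```r
# Toy counts for four block groups (made-up numbers, illustration only)
bg <- data.frame(
  geoid    = c("A", "B", "C", "D"),
  white    = c(900, 600, 300, 200),
  minority = c(100, 200, 300, 400)
)

# Share of each group's regional total that lives in each block group
white_share    <- bg$white / sum(bg$white)
minority_share <- bg$minority / sum(bg$minority)

# Signed per-block-group component in percentage points (assumed form):
# positive = relatively more White, negative = relatively more People of Color
bg$White_Minority_Dissim <- 100 * (white_share - minority_share)

# Classic region-wide dissimilarity index: half the sum of absolute differences
D <- 0.5 * sum(abs(white_share - minority_share))

print(bg)
print(D)
```

Under this form the signed components sum to zero across the region, so a mean close to zero is expected.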

Because the future land use (FLU) layer and the block groups do not match up geographically, I needed to aggregate to get a one-to-one match between the two. For each block group, I found the max, min, and mean of the maximum DU per acre across all intersecting FLU areas.
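A minimal sketch of that aggregation step, assuming the spatial intersection has already produced one row per (block group, FLU area) pair. The `flu_bg` table and its values are hypothetical. The sketch uses a simple unweighted mean; whether the actual analysis weighted by area is not stated here.

```r
library(dplyr)

# Hypothetical intersection result: one row per (block group, FLU area) pair
flu_bg <- data.frame(
  block_group_geoid10 = c("530330001001", "530330001001", "530330001002",
                          "530330001002", "530330001002"),
  max_du_ac = c(6, 48, 8.7, 8.7, 110)
)

# Collapse to one row per block group with the three candidate aggregations
flu_by_bg <- flu_bg %>%
  group_by(block_group_geoid10) %>%
  summarise(
    max_max_du  = max(max_du_ac, na.rm = TRUE),
    min_max_du  = min(max_du_ac, na.rm = TRUE),
    mean_max_du = mean(max_du_ac, na.rm = TRUE),
    .groups = "drop"
  )

print(flu_by_bg)
```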

Then I looked at which aggregation of max_du (max, min, or mean) correlated most strongly with the dissimilarity index, to select the best aggregation for further analysis.
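One way to run this comparison is to loop over the three aggregated columns and collect Spearman's rho for each. The sketch below does that on synthetic data standing in for `flu_dissim`; the generated values are made up, and only the column names follow this analysis. With tied values R cannot compute an exact p-value, so the sketch requests the asymptotic one with `exact = FALSE`.

```r
set.seed(1)

# Synthetic stand-in for flu_dissim (values are made up)
n <- 200
toy <- data.frame(mean_max_du = rexp(n, rate = 1 / 25))
toy$max_max_du <- toy$mean_max_du * runif(n, 1, 3)
toy$min_max_du <- toy$mean_max_du * runif(n, 0, 1)
toy$White_Minority_Dissim <- -0.05 * toy$mean_max_du + rnorm(n, sd = 3)

# Spearman's rho between the index and each candidate aggregation
aggs <- c("max_max_du", "min_max_du", "mean_max_du")
rho <- sapply(aggs, function(col) {
  ct <- cor.test(toy$White_Minority_Dissim, toy[[col]],
                 method = "spearman", exact = FALSE)
  unname(ct$estimate)
})

print(round(rho, 3))
# Aggregation with the largest absolute correlation
print(names(which.max(abs(rho))))
```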

## Warning in cor.test.default(flu_dissim$White_Minority_Dissim,
## flu_dissim$mean_max_du, : Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  flu_dissim$White_Minority_Dissim and flu_dissim$mean_max_du
## S = 4149597802, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.3531487

The strongest correlation was observed between the dissimilarity index and the block group mean (Spearman’s rho of about -0.35, shown above), so the rest of the analysis uses the mean max_du_ac by block group from the FLU layer.
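As a quick sanity check on the output above, rho can be recovered from the reported `S` statistic with the standard relationship between the two. The sketch assumes `n` is 2640, the number of block groups reported in the distribution summary further down.

```r
# Values copied from the cor.test output above
S <- 4149597802   # reported test statistic
n <- 2640         # number of block groups (assumed from the summary below)

# Spearman's rho recovered from S
rho <- 1 - 6 * S / (n * (n^2 - 1))
print(rho)
```

The result agrees with the reported estimate of -0.3531487.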

Let’s analyze the distributions of the mean_max_du and the dissimilarity indices.

## flu_dissim$mean_max_du 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     2640        0     2034        1    24.05    29.03    0.150    1.565 
##      .25      .50      .75      .90      .95 
##    4.896   12.121   29.624   55.911   73.238 
## 
## lowest :   0.02500000   0.04000000   0.04375000   0.04878049   0.05000000
## highest: 375.66144000 378.97200000 379.72000000 413.82000000 472.62600000
## flu_dissim$White_Minority_Dissim 
##         n   missing  distinct      Info      Mean       Gmd       .05       .10 
##      2640         0      2637         1 0.0006692     3.779   -6.6119   -4.7278 
##       .25       .50       .75       .90       .95 
##   -1.8155    0.7146    2.4032    3.6327    4.4206 
## 
## lowest : -18.734869 -17.408798 -17.308878 -16.358890 -16.072008
## highest:   7.888457   8.306193   8.374410   8.431756   8.961172

##  [1]   0.02222222   9.81000000  23.49858974  41.40000000  63.37666667
##  [6]  99.13600000 158.52171429 260.89000000 333.85343333 472.62600000

The max DUs have a long right tail, so I’m going to cap them at 100 and then create some maps of both the max DUs and the dissimilarity indices.

flu_dissim <- flu_dissim %>%
  mutate(mean_max_du_capped = replace(mean_max_du, mean_max_du > 100, 100))

ggplot(flu_dissim, aes(x=mean_max_du_capped)) + geom_histogram(bins=25)

Now let’s map the range of Max DU per acre.

MAX DU PER ACRE

## Reading layer `dbo.BLOCKGRP2010_NOWATER' from data source 
##   `MSSQL:server=AWS-PROD-SQL\Sockeye;database=ElmerGeo;trusted_connection=yes' 
##   using driver `MSSQLSpatial'
## Simple feature collection with 2644 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 1099353 ymin: -97548.53 xmax: 1622631 ymax: 477101.5
## Projected CRS: NAD83 / Washington North (ftUS)

Minority-White Dissimilarity

Comparing the allowed max DU map with the dissimilarity map, no obvious shared pattern emerges.

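# Treat block groups with no dissimilarity value as 0 so they still draw on the map
c.layer$White_Minority_Dissim[is.na(c.layer$White_Minority_Dissim)] <- 0
# create_map_dissim() is a helper defined outside this section
dissim_bg_map <- create_map_dissim(c.layer)
dissim_bg_map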

This plot shows the dissimilarity index against the capped max DU. Again the relationship is fairly weak, but it is there.

ggplot(flu_dissim, aes(x = mean_max_du_capped, y = White_Minority_Dissim)) +
  geom_point() +
  # Cubic polynomial fit to capture the bend in the relationship
  geom_smooth(method = "lm", formula = y ~ poly(x, 3))

It looks like:

With max DUs ranging from 0 to 25, as max DU increases, the White-People of Color dissimilarity decreases and gets closer to 0, meaning a smaller White share and more integration. The most integration is in the range of 25-75 DU per acre.

Also, with max DUs ranging from 75 to 100, as max DU increases, the White-People of Color dissimilarity decreases further and becomes more negative, meaning a smaller White share but more segregation of People of Color (I think this is what it means).

If we want to define single family, we will need to pick a threshold. Here is an analysis that defines single family at 10 DU per acre. The average dissimilarity index is more negative in non-single-family zoned areas (-0.82) and more positive in single-family zoned areas (2.66).

The difference between the two groups is statistically significant.
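The test behind this comparison is not shown here. One common choice for comparing two group means is a Welch two-sample t-test, sketched below on synthetic data. The generated values, the `single_family` flag, and the choice of test are all assumptions for illustration, not the method actually used.

```r
set.seed(42)

# Synthetic stand-in for flu_dissim (made-up values)
toy <- data.frame(mean_max_du = runif(1000, 0, 100))
toy$White_Minority_Dissim <- 2.5 - 0.04 * toy$mean_max_du +
  rnorm(1000, sd = 3)

# Assumed definition: single family = mean max DU per acre below 10
toy$single_family <- toy$mean_max_du < 10

# Mean dissimilarity in each group
group_means <- tapply(toy$White_Minority_Dissim, toy$single_family, mean)
print(group_means)

# Welch two-sample t-test (does not assume equal variances)
tt <- t.test(White_Minority_Dissim ~ single_family, data = toy)
print(tt$p.value)
```

A rank-based alternative such as `wilcox.test()` may be a better fit given the skewed distributions.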

What if we try a multi-breakpoint approach: 0-3 DUs low SF, 3-6 med SF, 6-10 high SF, 10-25 low MF, 25-75 med MF, 75+ high MF?
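These bands can be assigned with base R's `cut()`. In the sketch below the input values are made up and the short labels are shorthand for the six bands listed above. How boundary values such as exactly 3 or 10 are assigned is an assumption: `right = FALSE` puts them in the higher band.

```r
# Example mean max DU per acre values (made up)
mean_max_du <- c(0.5, 2.9, 3, 8, 10, 24, 25, 60, 75, 300)

# Break points from the text; right = FALSE makes each band [lower, upper),
# which is an assumption about how boundary values are assigned
sf_mf_cats <- cut(
  mean_max_du,
  breaks = c(0, 3, 6, 10, 25, 75, Inf),
  labels = c("Low_SF", "Med_SF", "High_SF", "Low_MF", "Med_MF", "High_MF"),
  right = FALSE
)

print(table(sf_mf_cats))
```

Dropping the break at 6 would merge the two middle single-family bands into one.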

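What if we simply estimate a linear model on the single-family and multi-family categories? In the output below, the 3-6 and 6-10 bands are merged into one Med_High_SF (3-10) category, and the lowest-density single-family category is the reference level.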

## 
## Call:
## glm(formula = White_Minority_Dissim ~ sf_mf_cats, data = flu_dissim)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -17.6188   -1.7160    0.5717    2.2982    8.7701  
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    2.9230     0.2292  12.754  < 2e-16 ***
## sf_mf_catsMed_High_SF (3-10)  -1.7982     0.2788  -6.451 1.32e-10 ***
## sf_mf_catsLow_MF (10-25)      -2.7320     0.2721 -10.042  < 2e-16 ***
## sf_mf_catsMed_MF(25-75)       -3.4574     0.2578 -13.413  < 2e-16 ***
## sf_mf_catsHigh_MF(75+)        -4.0391     0.2614 -15.453  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 10.97836)
## 
##     Null deviance: 32375  on 2639  degrees of freedom
## Residual deviance: 28928  on 2635  degrees of freedom
## AIC: 13824
## 
## Number of Fisher Scoring iterations: 2
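To read the model output, note that with a single categorical predictor the intercept is the mean of the reference category and each coefficient is the difference from that mean. The sketch below only redoes this arithmetic using numbers copied from the output above; the "Low_SF (reference)" label is added for readability.

```r
# Coefficients copied from the glm output above
intercept <- 2.9230   # reference level (lowest-density single-family category)
offsets <- c(
  "Med_High_SF (3-10)" = -1.7982,
  "Low_MF (10-25)"     = -2.7320,
  "Med_MF(25-75)"      = -3.4574,
  "High_MF(75+)"       = -4.0391
)

# Intercept + offset gives the mean dissimilarity of each category
group_means <- c("Low_SF (reference)" = intercept, intercept + offsets)
print(round(group_means, 3))

# Share of deviance explained by the categories
explained <- 1 - 28928 / 32375
print(round(explained, 3))
```

The category means fall steadily as allowed density rises, but the categories account for only about a tenth of the variation, which matches the weak relationship seen in the scatter plot.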

Now let’s look at the relationship between max_du and dissimilarity in Seattle only. We’ll do the correlation with max_du and the model with multiple categories.

SEATTLE ONLY ANALYSIS

## Warning in cor.test.default(flu_dissim_seattle$White_Minority_Dissim,
## flu_dissim_seattle$mean_max_du, : Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  flu_dissim_seattle$White_Minority_Dissim and flu_dissim_seattle$mean_max_du
## S = 24003625, p-value = 1.553e-10
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.2861418

##  block_group_geoid10   max_max_du        min_max_du      mean_max_du     
##  Length:482          Min.   :   6.00   Min.   :  0.00   Min.   :  3.806  
##  Class :character    1st Qu.:  58.81   1st Qu.:  6.00   1st Qu.: 37.405  
##  Mode  :character    Median :  78.41   Median :  8.70   Median : 48.778  
##                      Mean   : 116.25   Mean   : 13.94   Mean   : 63.772  
##                      3rd Qu.:  98.01   3rd Qu.:  8.70   3rd Qu.: 63.032  
##                      Max.   :1012.00   Max.   :313.63   Max.   :472.626  
##    TractIDInt           TractID          White_Black_Dissim White_AIAN_Dissim
##  Min.   :5.303e+10   Min.   :5.303e+10   Min.   :-73.214    Min.   :-66.765  
##  1st Qu.:5.303e+10   1st Qu.:5.303e+10   1st Qu.: -1.786    1st Qu.:  1.147  
##  Median :5.303e+10   Median :5.303e+10   Median :  1.835    Median :  3.194  
##  Mean   :5.303e+10   Mean   :5.303e+10   Mean   : -1.221    Mean   :  1.316  
##  3rd Qu.:5.303e+10   3rd Qu.:5.303e+10   3rd Qu.:  3.631    3rd Qu.:  4.333  
##  Max.   :5.303e+10   Max.   :5.303e+10   Max.   :  9.206    Max.   :  9.925  
##  White_API_Dissim   White_Other2_Dissim White_Hispanic_Dissim
##  Min.   :-29.2731   Min.   :-15.3428    Min.   :-19.713      
##  1st Qu.: -1.8446   1st Qu.: -1.9159    1st Qu.: -0.106      
##  Median :  0.9203   Median :  0.2643    Median :  1.786      
##  Mean   : -0.5081   Mean   : -0.2228    Mean   :  1.160      
##  3rd Qu.:  2.4370   3rd Qu.:  1.9342    3rd Qu.:  3.241      
##  Max.   :  5.7983   Max.   :  5.9651    Max.   :  5.511      
##  White_Minority_Dissim Black_AIAN_Dissim  Black_API_Dissim   Black_Other_Dissim
##  Min.   :-17.30888     Min.   :-60.4931   Min.   :-28.9056   Min.   :-14.7696  
##  1st Qu.: -1.79509     1st Qu.:  0.0000   1st Qu.: -2.4830   1st Qu.: -3.4141  
##  Median :  1.09956     Median :  0.9764   Median : -0.9214   Median : -0.9709  
##  Mean   : -0.07273     Mean   :  2.5367   Mean   :  0.7131   Mean   :  0.9983  
##  3rd Qu.:  2.43892     3rd Qu.:  3.9054   3rd Qu.:  1.6020   3rd Qu.:  1.8699  
##  Max.   :  4.98170     Max.   : 73.7148   Max.   : 57.4020   Max.   : 73.2305  
##  Black_Hispanic_Dissim AIAN_Asian_Dissim  AIAN_Other_dissim
##  Min.   :-22.08406     Min.   :-32.6779   Min.   :-16.142  
##  1st Qu.: -1.48705     1st Qu.: -4.2836   1st Qu.: -4.792  
##  Median :  0.06902     Median : -2.2117   Median : -2.381  
##  Mean   :  2.38150     Mean   : -1.8236   Mean   : -1.538  
##  3rd Qu.:  2.73997     3rd Qu.: -0.7212   3rd Qu.: -0.803  
##  Max.   : 72.83610     Max.   : 63.0224   Max.   : 67.864  
##  AIAN_Hispanic_Dissim API_Other_Disim.Other.2. API_Hispanic_Dissim
##  Min.   :-22.5279     Min.   :-11.1634         Min.   :-15.5342   
##  1st Qu.: -2.7275     1st Qu.: -2.2153         1st Qu.: -0.4572   
##  Median : -1.2448     Median : -0.2190         Median :  1.0429   
##  Mean   : -0.1552     Mean   :  0.2852         Mean   :  1.6684   
##  3rd Qu.: -0.1385     3rd Qu.:  2.0905         3rd Qu.:  2.8496   
##  Max.   : 67.4497     Max.   : 32.6779         Max.   : 29.5593   
##  Other_Hispanic_Dissim mean_max_du_capped
##  Min.   :-22.2454      Min.   :  3.806   
##  1st Qu.: -0.3303      1st Qu.: 37.405   
##  Median :  1.2227      Median : 48.778   
##  Mean   :  1.3832      Mean   : 51.619   
##  3rd Qu.:  3.4788      3rd Qu.: 63.032   
##  Max.   : 14.1670      Max.   :100.000   
##                             sf_mf_cats 
##  Low_SF (less than 3 Du per acre):  0  
##  Med_High_SF (3-10)              : 19  
##  Low_MF (10-25)                  :  2  
##  Med_MF(25-75)                   :117  
##  High_MF(75+)                    :344  
## 

```